30 research outputs found

    Analysis of a bipartite network of movie ratings and catalogue network growth models

    Network science is a rapidly growing field that draws important results from mathematics, physics, computer science, sociology, and many other disciplines. Many problems in natural and man-made systems involve interactions between large numbers of agents that take place over a non-trivial topology. These problems lend themselves naturally and successfully to a network representation. Of particular interest are models of the growth and evolution of networks, because the vast majority of the systems they represent are not static. This work is concerned with bipartite networks: systems with two different types of interacting constituents. This thesis is structured as follows. In Chapter 1, a network is defined as a graph and a brief introduction to the concepts used throughout this work is given. We describe the well-known network growth model of Preferential Attachment [2] and a model of the evolution of a bipartite network whose agent quantities are fixed [9]. In Chapter 2, we study data from Netflix, an online movie rental service whose users can rate the movies they rent. We show how this system can be represented as a network and analyse some of its properties. The number of ratings per user and per movie follows a power-law distribution with an exponential cutoff, which indicates saturation in the number of ratings that a movie can receive or a user can give. We also find that movies and users in the system form densely connected neighbourhoods. Chapter 3 is concerned with the development of models that attempt to explain the growth and evolution of networks with saturation and a limited number of agents. We develop a network growth model in which the agents are drawn from fixed catalogues. The model can sometimes be solved exactly, in other cases approximately using asymptotics, and in general numerically. The results given by this model describe what is observed in simulated networks and show some of the characteristics observed in the Netflix network.
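
For readers unfamiliar with the term, a power law with an exponential cutoff is commonly written in the form below. The symbols are assumptions introduced here for illustration and are not taken from the thesis: $k$ is the number of ratings of a user or movie, $\gamma$ is the power-law exponent, and $\kappa$ is the cutoff scale beyond which saturation suppresses the tail.

```latex
P(k) \;\propto\; k^{-\gamma}\, e^{-k/\kappa}, \qquad k \ge 1,\ \gamma > 0,\ \kappa > 0 .
```

For $k \ll \kappa$ the exponential factor is close to 1 and the distribution behaves like a pure power law; for $k \gg \kappa$ the exponential factor dominates and the tail decays rapidly.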

    Finding role communities in directed networks using Role-Based Similarity, Markov Stability and the Relaxed Minimum Spanning Tree

    We present a framework to cluster nodes in directed networks according to their roles by combining Role-Based Similarity (RBS) and Markov Stability, two techniques based on flows. First, we compute the RBS matrix, which contains the pairwise similarities between nodes according to the scaled number of in- and out-directed paths of different lengths. The weighted RBS similarity matrix is then transformed into an undirected similarity network using the Relaxed Minimum-Spanning Tree (RMST) algorithm, which uses the geometric structure of the RBS matrix to unblur the network, such that edges between nodes with high, direct RBS are preserved. Finally, we partition the RMST similarity network into role-communities of nodes at all scales using Markov Stability to find a robust set of roles in the network. We showcase our framework through a biological and a man-made network. Comment: 4 pages, 2 figures
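
The first step of the pipeline above (a similarity matrix built from scaled in- and out-path counts) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the fixed scaling factor `alpha`, the path-length limit `k_max`, and the use of cosine similarity between path profiles are all assumptions made here.

```python
import math

def path_profile(adj, alpha, k_max):
    """Per-node features: scaled counts of out- and in-paths of length 1..k_max.

    adj[i][j] = 1 means an edge i -> j. `alpha` is a hypothetical scaling
    factor that damps the contribution of longer paths.
    """
    n = len(adj)
    adj_t = [[adj[j][i] for j in range(n)] for i in range(n)]
    features = [[] for _ in range(n)]
    for matrix in (adj, adj_t):          # out-paths first, then in-paths
        vec = [1.0] * n
        for _ in range(k_max):
            vec = [alpha * sum(matrix[i][j] * vec[j] for j in range(n))
                   for i in range(n)]
            for i in range(n):
                features[i].append(vec[i])
    return features

def cosine(u, v):
    """Cosine similarity; defined as 0.0 if either vector is all zeros."""
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)

def role_similarity(adj, alpha=0.5, k_max=3):
    """Pairwise similarity of nodes based on their path profiles."""
    feats = path_profile(adj, alpha, k_max)
    n = len(adj)
    return [[cosine(feats[i], feats[j]) for j in range(n)] for i in range(n)]

# Toy graph: 0 -> 2, 1 -> 2, 2 -> 3. Nodes 0 and 1 are both pure sources.
adj = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
sim = role_similarity(adj)
```

In this toy graph, nodes 0 and 1 have identical path profiles and so receive the maximal similarity, while node 0 (a pure source) and node 3 (a pure sink) share no non-zero features. The RMST and Markov Stability steps are not sketched here.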

    Interest communities and flow roles in directed networks: the Twitter network of the UK riots

    Directionality is a crucial ingredient in many complex networks in which information, energy or influence is transmitted. In such directed networks, analysing flows (and not only the strength of connections) is crucial to reveal important features of the network that might go undetected if the orientation of connections is ignored. We showcase here a flow-based approach for community detection in networks through the study of the network of the most influential Twitter users during the 2011 riots in England. Firstly, we use directed Markov Stability to extract descriptions of the network at different levels of coarseness in terms of interest communities, i.e., groups of nodes within which flows of information are contained and reinforced. Such interest communities reveal user groupings according to location, profession, employer, and topic. The study of flows also allows us to generate an interest distance, which affords a personalised view of the attention in the network as viewed from the vantage point of any given user. Secondly, we analyse the profiles of incoming and outgoing long-range flows with a combined approach of role-based similarity and the novel relaxed minimum spanning tree algorithm to reveal that the users in the network can be classified into five roles. These flow roles go beyond the standard leader/follower dichotomy and differ from classifications based on regular/structural equivalence. We then show that the interest communities fall into distinct informational organigrams characterised by a different mix of user roles reflecting the quality of dialogue within them. Our generic framework can be used to provide insight into how flows are generated, distributed, preserved and consumed in directed networks. Comment: 32 pages, 14 figures. Supplementary Spreadsheet available from: http://www2.imperial.ac.uk/~mbegueri/Docs/riotsCommunities.zip or http://rsif.royalsocietypublishing.org/content/11/101/20140940/suppl/DC

    The 'who' and 'what' of #diabetes on Twitter

    Social media are being increasingly used for health promotion, yet the landscape of users, messages and interactions in such fora is poorly understood. Studies of social media and diabetes have focused mostly on patients, or on public agencies addressing the disease, but have not looked broadly at all the participants or the diversity of content they contribute. We study Twitter conversations about diabetes through the systematic analysis of 2.5 million tweets collected over 8 months and the interactions between their authors. We address three questions: (1) what themes arise in these tweets? (2) who are the most influential users? (3) which types of users contribute to which themes? We answer these questions using a mixed-methods approach, integrating techniques from anthropology, network science and information retrieval such as thematic coding, temporal network analysis, and community and topic detection. Diabetes-related tweets fall within broad thematic groups: health information, news, social interaction, and commercial. At the same time, humorous messages and references to popular culture appear consistently, more than any other type of tweet. We classify authors according to their temporal 'hub' and 'authority' scores. Whereas the hub landscape is diffuse and fluid over time, top authorities are highly persistent across time and comprise bloggers, advocacy groups and NGOs related to diabetes, as well as for-profit entities without specific diabetes expertise. Top authorities fall into seven interest communities as derived from their Twitter follower network. Our findings have implications for public health professionals and policy makers who seek to use social media as an engagement tool and to inform policy design. Comment: 25 pages, 11 figures, 7 tables. Supplemental spreadsheet available from http://journals.sagepub.com/doi/suppl/10.1177/2055207616688841, Digital Health, Vol 3, 201
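
To give a sense of what 'hub' and 'authority' scores measure, the sketch below implements the standard static HITS iteration on a single directed graph. It is not the temporal variant used in the study, whose details the abstract does not give. The function name `hits`, the edge-list format, and the toy graph are assumptions made here for illustration.

```python
import math

def hits(edges, n, iterations=50):
    """Static HITS scores for a directed graph with nodes 0..n-1.

    `edges` is a list of (source, target) pairs. A node is a good hub if it
    points to good authorities, and a good authority if good hubs point to it.
    """
    hubs = [1.0] * n
    auths = [1.0] * n
    for _ in range(iterations):
        # Authority update: sum the hub scores of nodes pointing to each node.
        new_auths = [0.0] * n
        for src, dst in edges:
            new_auths[dst] += hubs[src]
        # Hub update: sum the (new) authority scores of the nodes pointed to.
        new_hubs = [0.0] * n
        for src, dst in edges:
            new_hubs[src] += new_auths[dst]
        # Normalise to unit Euclidean length so the scores stay bounded.
        auth_norm = math.sqrt(sum(a * a for a in new_auths)) or 1.0
        hub_norm = math.sqrt(sum(h * h for h in new_hubs)) or 1.0
        auths = [a / auth_norm for a in new_auths]
        hubs = [h / hub_norm for h in new_hubs]
    return hubs, auths

# Toy graph: nodes 0, 1 and 3 all point to node 2; node 3 also points to node 0.
edges = [(0, 2), (1, 2), (3, 2), (3, 0)]
hubs, auths = hits(edges, n=4)
```

Here node 2 ends up with the highest authority score because every other node points to it, and node 3 has the highest hub score because it points to both nodes that receive links.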

    Customer mobility and congestion in supermarkets

    The analysis and characterization of human mobility using population-level mobility models is important for numerous applications, ranging from the estimation of commuter flows in cities to modeling trade flows between countries. However, almost all of these applications have focused on large spatial scales, which typically range from intra-city to inter-country scales. In this paper, we investigate population-level human mobility models on a much smaller spatial scale by using them to estimate customer mobility flow between supermarket zones. We use anonymized, ordered customer-basket data to infer empirical mobility flow in supermarkets, and we apply variants of the gravity and intervening-opportunities models to fit this mobility flow and estimate the flow on unseen data. We find that a doubly-constrained gravity model and an extended radiation model (which is a type of intervening-opportunities model) can successfully estimate 65–70% of the flow inside supermarkets. Using a gravity model as a case study, we then investigate how to reduce congestion in supermarkets using mobility models. We model each supermarket zone as a queue, and we use a gravity model to identify store layouts with low congestion, which we measure either by the maximum number of visits to a zone or by the total mean queue size. We then use a simulated-annealing algorithm to find store layouts with lower congestion than a supermarket's original layout. In these optimized store layouts, we find that popular zones are often on the perimeter of a store. Our research gives insight both into how customers move in supermarkets and into how retailers can arrange stores to reduce congestion. It also provides a case study of human mobility on small spatial scales.
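
As a rough illustration of the doubly-constrained gravity model mentioned above, the sketch below fits flows between zones so that each zone's total outflow and total inflow match prescribed values. It uses the textbook balancing-factor iteration, not the paper's fitting procedure. The exponential deterrence function, the parameter `beta`, the function name, and the example numbers are assumptions chosen here; the abstract does not specify them.

```python
import math

def doubly_constrained_gravity(outflows, inflows, distances, beta=1.0,
                               iterations=200):
    """Estimate flows T[i][j] = a_i * O_i * b_j * D_j * f(d_ij).

    The balancing factors a_i and b_j are iterated so that row sums match
    `outflows` (O) and column sums match `inflows` (D). Assumes
    sum(outflows) == sum(inflows) and f(d) = exp(-beta * d), a hypothetical
    choice of deterrence function.
    """
    n, m = len(outflows), len(inflows)
    f = [[math.exp(-beta * distances[i][j]) for j in range(m)]
         for i in range(n)]
    a = [1.0] * n
    b = [1.0] * m
    for _ in range(iterations):
        a = [1.0 / sum(b[j] * inflows[j] * f[i][j] for j in range(m))
             for i in range(n)]
        b = [1.0 / sum(a[i] * outflows[i] * f[i][j] for i in range(n))
             for j in range(m)]
    return [[a[i] * outflows[i] * b[j] * inflows[j] * f[i][j]
             for j in range(m)] for i in range(n)]

# Three hypothetical zones with made-up totals and pairwise distances.
outflows = [30.0, 50.0, 20.0]
inflows = [40.0, 40.0, 20.0]
distances = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 1.0, 0.0],
]
flows = doubly_constrained_gravity(outflows, inflows, distances)
row_sums = [sum(row) for row in flows]
col_sums = [sum(flows[i][j] for i in range(3)) for j in range(3)]
```

After the iteration converges, `row_sums` reproduces `outflows` and `col_sums` reproduces `inflows` to within numerical tolerance; the deterrence function alone determines how each zone's flow is split among destinations.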